Applying fallback to prosodic unit selection from a small imitation database

Author

  • Joram Meron
Abstract

This paper presents an extension to previous work [1], which used an imitation speech database and a prosodic unit selection algorithm to improve the naturalness of synthesized speech. The basic approach of the system is to combine rule-generated prosody with a corpus-based prosody module, aiming to retain both the robustness of the rule prosody and the naturalness of the human recorded speech units. This combination was achieved by using a database of imitation speech, which enables a higher level of annotation that is then used by a dynamic unit selection algorithm. Although listeners have been shown to prefer the prosody generated with this method over the original rule-generated prosody, the usual problems related to selection from an undersized training corpus were occasionally present. Instead of increasing the size of the training database, a different solution is investigated here: a controlled fallback to the rule prosody, performed in a way that is compatible with the unit selection approach. The suggested method has a minimal effect on the required memory size and the amount of computation, and was shown to produce favorable results.
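The abstract does not say how the fallback is implemented. One way to make a fallback compatible with unit selection is to add the rule-generated value to the candidate list at every position, with a fixed penalty, so that the same dynamic search picks it only when the database units fit poorly. The Python sketch below illustrates that idea and is not the paper's algorithm. It is built on assumptions: the function name `select_prosody`, the one-F0-value-per-syllable representation, the context labels, the cost functions, and the `fallback_penalty` and `join_weight` parameters are all hypothetical.

```python
from typing import Dict, List, Tuple

# A candidate is (F0 value in Hz, is_fallback). Hypothetical representation.
Candidate = Tuple[float, bool]


def select_prosody(
    rule_f0: List[float],
    labels: List[str],
    database: Dict[str, List[float]],
    fallback_penalty: float = 15.0,  # assumed fixed cost of using the rule value
    join_weight: float = 0.5,        # assumed weight on F0 jumps between units
) -> List[Candidate]:
    """Viterbi search over database units plus one rule-based fallback per slot."""
    if not rule_f0:
        return []

    # Build the lattice: database units matching the label, plus the fallback.
    lattice: List[List[Candidate]] = []
    for f0, label in zip(rule_f0, labels):
        candidates = [(unit, False) for unit in database.get(label, [])]
        candidates.append((f0, True))  # the rule value is always available
        lattice.append(candidates)

    def target_cost(pos: int, cand: Candidate) -> float:
        value, is_fallback = cand
        # Fallback pays a flat penalty; database units pay their distance
        # from the rule-generated target.
        return fallback_penalty if is_fallback else abs(value - rule_f0[pos])

    best = [target_cost(0, c) for c in lattice[0]]
    back: List[List[int]] = []
    for pos in range(1, len(lattice)):
        new_best, pointers = [], []
        for cand in lattice[pos]:
            costs = [
                best[j] + join_weight * abs(cand[0] - prev[0])
                for j, prev in enumerate(lattice[pos - 1])
            ]
            j_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[j_min] + target_cost(pos, cand))
            pointers.append(j_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path back from the last position.
    idx = min(range(len(best)), key=best.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [lattice[pos][i] for pos, i in enumerate(path)]
```

In this sketch the fallback adds only one extra candidate per position, so the search cost and memory grow very little. A position with no matching database unit, or with only badly mismatched units, resolves to the rule value.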

Similar resources

Prosodic unit selection using an imitation speech database

Starting with a rule based prosody generation system, we try to improve the naturalness of the generated prosody by using a corpus based approach, without losing the advantages of the rule based method. To achieve this, a prosodic unit selection method is introduced, which is similar in its approach to the waveform unit selection used by large unit inventory waveform concatenation systems. Tryi...

Design and evaluation of a speech synthesis model using concatenation of prosodic-context-sensitive units

This paper describes the design and evaluation of prosodically-sensitive concatenative units for a Persian text-to-speech (TTS) synthesis system. The syllables used are prosodically conditioned in the sense that a single conventional syllable is stored as different versions taken directly from the different prosodic domains of the prosodically labeled, read sentences. The three levels of the Per...

Prosodically modifying speech for unit selection speech synthesis databases

This paper investigates the practical limits of artificially increasing the prosodic richness of a unit selection database by transforming the prosodic realization of constituent sentences. The resulting high-quality transformed sentences are added to the database as new material. We examine in detail one of the most challenging prosodic transformations, namely converting statements into yes/no...

Unit Selection Speech Synthesis Using Phonetic-Prosodic Description of Speech Databases

This paper describes an approach to speech synthesis based on using speech databases at different stages of TTS process. Speech database units are phones in different segmental and prosodic contexts. Pitch synchronous segmentation and labeling of databases allows storing both segmental and prosodic information. Phonetic-prosodic annotations of speech databases are involved in off-line training ...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Publication date: 2002